Background

A version of this page was originally created as an entry for a visualization contest at my workplace, where it was one of three winning entries. My goals were to explore the unconventional use of the R package visNetwork for Knowledge Graph visualization, introduce the concept of Linked Data to my colleagues, and have fun along the way.

Introduction

My presentation explores the visualization of a simple row-by-column Excel spreadsheet provided for the challenge. The data captures information about human study subjects assigned to clinical trials at clinics located at the specified latitude and longitude.

The benefits of converting relational data to Graph Data are well known and are beyond the scope of this discussion.

For people who are new to the technology, graph data can be confusing at first, eliciting feelings of Anger or Fear. Visualization helps facilitate understanding and can also be Artistic. For Data Nerds, the true beauty is found in the semantic inter-connectedness of the data itself.

Inspiration

British Television Series Spaced, S1E1

Daisy: What do you do, Brian?
Brian: I’m an artist.
Daisy: Oh… What kind of thing do you do?
Brian: Anger. Pain. Fear. Aggression.
Daisy: Oh. Watercolours?
Brian: It's a bit more complex than that.


“It's a bit more complex than that.” (Methodology)

Data is read from the spreadsheet using R and converted to graph data in the Resource Description Framework (RDF).

The graph data schema replaces the traditional row-by-column representation with semantic relationships between the types of entities in the data. For example, a Person attends a Clinic, which is locatedIn a City. No other details were provided for the studies, so Phase was assigned directly to the Person, interpreted as “the person participates in a Phase n study.” The visualizations on this page focus on the people and the location of the clinics they attend.

R is used both to materialize the data into the graph model and to visualize it.
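To make the row-to-graph step concrete, here is a minimal sketch written in Python rather than the R used for this page. The sample row, its column names, the phase and city values, and the helper row_to_triples are all hypothetical. Only the predicate names and the Person_1 and Clinic_1 values come from the data on this page. The real conversion also mints hashed identifiers such as CLINIC_7c89a32a, which this sketch does not attempt to reproduce.

```python
# Illustrative only: the page itself was built with R, not this code.
# Column names and the phase/city values are hypothetical placeholders.

def row_to_triples(row):
    """Turn one spreadsheet row into (subject, predicate, object) triples."""
    person = "ucbat:" + row["person"]
    clinic = "ucbat:" + row["clinic"]
    city = "ucbat:" + row["city"]
    return [
        (person, "rdf:type", "ucbat:Person"),
        (person, "ucbat:age", row["age"]),
        # Phase hangs directly off the Person, as described above.
        (person, "ucbat:phase", "ucbat:Phase_" + str(row["phase"])),
        (person, "ucbat:attends", clinic),
        (clinic, "rdf:type", "ucbat:Clinic"),
        (clinic, "ucbat:latitude", row["latitude"]),
        (clinic, "ucbat:longitude", row["longitude"]),
        (clinic, "ucbat:locatedIn", city),
    ]

sample_row = {
    "person": "Person_1",
    "age": "29",
    "phase": 1,            # hypothetical value
    "clinic": "Clinic_1",
    "latitude": "38.900394",
    "longitude": "-77.05126",
    "city": "City_1",      # hypothetical value
}

triples = row_to_triples(sample_row)
for subject, predicate, obj in triples:
    print(subject, predicate, obj)
```

Each column that names an entity becomes a node, and each remaining column becomes a literal attached to the entity it describes, which is why a single row fans out into several triples.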

Converted Data

A view of the data for Person_1, showing its structure as Linked Data in Turtle (Terse RDF Triple Language, TTL) syntax. Only the immediate data for the Person and the Clinic they attend is shown.

ucbat:Person_1
  a             ucbat:Person ;
  rdfs:label    "Person_1"^^xsd:string ;
  ucbat:age     "29"^^xsd:int ;
  ucbat:attends ucbat:CLINIC_7c89a32a ;
  ucbat:gender  ucbat:GEN_ac3000ea ;
  ucbat:phase   ucbat:PHASE_8b8fd9c6 .

ucbat:CLINIC_7c89a32a
  a               ucbat:Clinic ;
  rdfs:label      "Clinic_1"^^xsd:string ;
  ucbat:latitude  "38.900394"^^xsd:string ;
  ucbat:locatedIn ucbat:CITY_46069427 ;
  ucbat:longitude "-77.05126"^^xsd:string .

This Force Directed Network Graph of the data, without labels or interactivity, evokes feelings of anger and frustration.

The area below will appear blank until the data is processed by your browser. Please remain patient and do not refresh the view.


Interaction: Reposition image: Left mouse button hold and drag. Zoom in/out: Mouse scroll wheel.

Technical Details
Data source: DevSol.xlsx converted to Linked Data
R Package: visNetwork
Parameters: Physics = repulsion, with low damping to keep the plot in motion.

“Don’t look away. Let the fear wash over you!” - David Lynch in Family Guy, How David Lynch Stole Christmas

Minimal labels (with mouseover) and misleading images only serve to evolve your anger into fear. Is this even the data? It is - which makes it all the more terrifying!


Interaction: Reposition image: Left mouse button hold and drag. Zoom in/out: Mouse scroll wheel. View node labels: Mouseover nodes.

Technical Details
Data source: DevSol.xlsx converted to Linked Data
R Package: visNetwork
Parameters: Layout with a fixed random seed and externally hosted images.
Bat image copyright Edward Gorey.

Data imitates art, with inspiration from Flower Garden by Gustav Klimt (1905):

Data for this plot was subset to six patients (two male and four female) to illustrate their connection to clinics, cities, states, and country.


Interaction: None. Observation only.

Technical Details

Data source: DevSol.xlsx converted to Linked Data
R Package: visNetwork
Other Software: GIMP

Image Preparation

  • GIMP was used to extract individual flowers from the source image and then apply the filter Artistic|Oilify to each flower. GIMP was also used to create the background by cloning a portion of the original green background over the entire image to remove the original flowers.
  • Individual flower images were saved as transparent .PNG images and hosted on GitHub for access by the visNetwork plot. visNetwork cannot use local images for nodes with the exception of RShiny applications.
  • The plot was created from the data using R visNetwork. The package does not allow use of background images, so the plot was saved to a transparent .PNG and then overlaid on top of the prepared background image to create a static, non-interactive display.

Is this actual data? In the live visNetwork plot below, start at the top by hovering your mouse over a flower and trace the route down from Person to Country, hovering over the links and nodes to reveal their relationships.


Interaction: Node/Edge labels: Mouse over nodes/edges. Reposition nodes: Select node with left button, hold and drag. Zoom In/Out: Mouse scroll wheel.

The Real Beauty is in the Data

For graph nerds like myself, the true beauty lies in the ability to query along the semantic graph of data, without FEAR, ANGER, or Artistic Abstraction. The query language SPARQL was used to create all the network plots on this page, including the one below.
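A network plot of this kind is fed by two tables: visNetwork expects a nodes table with an id column and an edges table with from and to columns, so query results have to be reshaped into that form. The sketch below shows the reshaping in Python rather than R, and it stands in for SPARQL with a plain filter over a list of triples, so treat it as an illustration of the idea and not as the code behind this page. The triples, the Person_2 and City_1 names, and the helper to_nodes_and_edges are hypothetical.

```python
# Illustrative only: hypothetical triples, and a list filter in place of SPARQL.

TRIPLES = [
    ("Person_1", "attends", "Clinic_1"),
    ("Person_2", "attends", "Clinic_1"),
    ("Clinic_1", "locatedIn", "City_1"),
    ("Person_1", "age", "29"),
]

def to_nodes_and_edges(triples, predicates):
    """Keep triples whose predicate is wanted; return node and edge rows."""
    edges = [
        {"from": s, "to": o, "label": p}
        for s, p, o in triples
        if p in predicates
    ]
    # Collect each node once, in first-seen order.
    seen = {}
    for edge in edges:
        for node_id in (edge["from"], edge["to"]):
            seen.setdefault(node_id, {"id": node_id, "label": node_id})
    return list(seen.values()), edges

nodes, edges = to_nodes_and_edges(TRIPLES, {"attends", "locatedIn"})
print([node["id"] for node in nodes])
print(len(edges))
```

The age literal is dropped here because only attends and locatedIn were requested. Adding "age" to the set of predicates would draw the value 29 as a node of its own, which is one reason the choice of predicates shapes what the plot communicates.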


Interaction: Node/Edge labels: Mouse over nodes/edges. Reposition nodes: Select node with left button, hold and drag. Zoom In/Out: Mouse scroll wheel.

Technical Details
Data source: DevSol.xlsx converted to Linked Data
R Package: visNetwork
Parameters: Physics disabled. Minimal styling.